Abstract. This paper presents a chatbot for a Dialogue-Based Computer-Assisted 
second Language Learning (DB-CALL) system. A DB-CALL system normally 
leads dialogues by asking questions according to given scenarios. User utterances 
outside the scenarios are normally considered as semantically improper and simply 
rejected. In this paper, we assume that raising the freedom of dialogue can stimulate 
the user's interest in learning. For this, a chatbot based on a search engine with a 
dialogue corpus has been developed to deal with conversations out of the scenarios. 
We evaluate the chatbot separately in two different cases: as an independent bot 
and as an auxiliary system. The results showed that the chatbot achieved a much 
lower turn success ratio as an auxiliary system than as an independent chatbot. 

Keywords: chatbot, computer-assisted second-language learning system, 
dialogue-based CALL, dialogue system. 


1. Introduction 

Dialogues between a user and a DB-CALL system normally need to follow given 
scenarios on chosen topics. The system leads the dialogue by asking questions, 
which the language learner needs to answer (Lee et al., 2011). The system then 
evaluates the answers to see if they are appropriate for the given question (Kwon 
et al., 2015). Such evaluation is based entirely on the given scenarios, so 
utterances outside the scenarios are considered semantically improper 
and rejected by the system. This means that even meaningful conversations are 
perceived as errors. 

In this paper, we present a DB-CALL system which adopts a chatbot to enable 
free conversations between the learner and the system. We also investigate what 
preparation a chatbot needs to assist a DB-CALL system. 


2. GenieTutor - a task-oriented dialogue 
system for second-language learning 

We developed GenieTutor, a DB-CALL system for English learners in Korea, 
several years ago. At first, it was a role-play dialogue system for second language 
learners (Kwon et al., 2015). After that, an upgraded version of GenieTutor was 
developed. Our goal was to increase the freedom of the user's conversation so that 
it would become more like a conversation between people. To achieve this, topics 
were treated as tasks, each of which could be separated into several smaller 
subtasks; some of the subtasks could, in turn, be executed independently of their 
order. As a result, a certain degree of freedom in the order of utterances 
was allowed (Choi, Kwon, Kim, & Lee, 2016; Kwon, Kim, & Lee, 2016). For 
example, for the task ‘ordering food’, which consists of the subtasks ‘[greeting] > 
choose main dishes > choose side dishes > pay the bill > [greeting]’, the subtask 
[greeting] can be skipped, and the user can choose side dishes together with the 
main dishes and then just ask for the bill. 
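The subtask structure described above can be sketched roughly as follows. This is a minimal illustration, not GenieTutor's actual implementation: the class `Task`, its fields, and the subtask identifiers are hypothetical names, and the single ordering constraint (paying only after a main dish is chosen) is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task split into subtasks (hypothetical model, not GenieTutor's code)."""
    required: set[str]
    optional: set[str] = field(default_factory=set)   # e.g. greetings, may be skipped
    # prerequisites[s] lists subtasks that must be covered before (or with) s.
    prerequisites: dict[str, set[str]] = field(default_factory=dict)
    done: set[str] = field(default_factory=set)

    def accept(self, subtasks: set[str]) -> bool:
        """Mark the subtasks covered by one user utterance as done.

        Returns False when the utterance advances no pending subtask.
        """
        pending = (self.required | self.optional) - self.done
        candidates = subtasks & pending
        matched = {
            s for s in candidates
            if self.prerequisites.get(s, set()) <= self.done | candidates
        }
        self.done |= matched
        return bool(matched)

    def is_complete(self) -> bool:
        """The task is achieved once every required subtask is done."""
        return self.required <= self.done


order_food = Task(
    required={"choose_main_dishes", "choose_side_dishes", "pay_bill"},
    optional={"greeting"},
    prerequisites={"pay_bill": {"choose_main_dishes"}},  # assumed constraint
)

# Asking for the bill first is not accepted in this sketch.
print(order_food.accept({"pay_bill"}))                                   # False
# One utterance covers two subtasks at once; the greeting is skipped.
print(order_food.accept({"choose_main_dishes", "choose_side_dishes"}))   # True
print(order_food.accept({"pay_bill"}))                                   # True
print(order_food.is_complete())                                          # True
```

Keeping the optional subtasks separate from the required ones is what lets the greeting be skipped without blocking completion of the task.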

Despite a greater degree of freedom, user utterances could still be rejected: 
an utterance might lack the keywords the system needs to recognize it, or 
it might be outside the scenarios. In either case, the system would treat 
it as an “unknown utterance” and ask the user to repeat the response. 
Here is an example in which the system simply repeats its previous utterance, “Your 
total comes to 160 dollars”, when the user answers with “I don’t have money”, which 
is an utterance outside the scenario: 

System: Your total comes to 160 dollars. 

User: I don’t have money. 

System: Your total comes to 160 dollars. 

User: Here you go. 

We performed a user evaluation of this system in which we asked the question: “Do 
you think the out of topic conversation (free talking) function is necessary for a 
DB-CALL system?”. About 66.7% of the 30 participants answered “necessary” 
or “very necessary”. This percentage increases to 86.7% for intermediate level 
learners. Since only 46.7% of the elementary level learners were positive, this 
suggests that higher level learners prefer to learn language through free dialogue 
(Table 1). 


Table 1. User evaluation on GenieTutor: “Is out of topic conversation necessary?” 


Answer                Elementary level learners  Percentage  Intermediate level learners  Percentage  In total
5: Very necessary     5                          33.3%       5                            33.3%       33.3%
4: Necessary          2                          13.3%       8                            53.3%       33.3%
3: Not sure           3                          20%         0                            0%          10.0%
2: Not necessary      3                          20%         2                            13.3%       16.7%
1: Very unnecessary   2                          13.3%       0                            0%          6.7%
In total              15                         100%        15                           100%        100%
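The percentages quoted in the text can be recomputed from the counts in Table 1. The short sketch below does this; the variable and function names are illustrative only.

```python
# Answer counts from Table 1, keyed by rating (5 = very necessary ... 1 = very unnecessary).
counts = {
    "elementary": {5: 5, 4: 2, 3: 3, 2: 3, 1: 2},
    "intermediate": {5: 5, 4: 8, 3: 0, 2: 2, 1: 0},
}


def positive_share(by_rating):
    """Percentage of answers rated 4 ("necessary") or 5 ("very necessary")."""
    total = sum(by_rating.values())
    return 100 * (by_rating[4] + by_rating[5]) / total


# Combine both learner groups to obtain the overall figures.
overall = {
    rating: counts["elementary"][rating] + counts["intermediate"][rating]
    for rating in range(1, 6)
}

for name, group in [*counts.items(), ("all", overall)]:
    print(f"{name}: {positive_share(group):.1f}%")
# elementary: 46.7%
# intermediate: 86.7%
# all: 66.7%
```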


3. GenieTutor Plus - allows free 
conversations with chatbot 

To meet the user needs shown in Table 1, a chatbot was considered necessary for 
the DB-CALL system. A search-based chatbot was developed to assist GenieTutor 
and allow users to have free conversations with the system. The dialogue is still 
mainly based on scenarios. However, if the semantic correctness evaluation module 
determines that the user utterance cannot be classified into any of the predefined 
dialogue acts, the utterance is considered out-of-task and is answered by the 
chatbot. 

The main purpose of the DB-CALL system is to help learners practice given 
dialogues. To fulfill this purpose, right after the chatbot response, the system 
prompts the user to return to the topic conversation by speaking in accordance with 
the scenario. For example, when a user says “I have no money”, the system 
utters “What a pity!” through the chatbot, and then repeats “Your total comes to 
160 dollars” according to the scenario (Figure 1). The sentences produced by the 
chatbot are highlighted in red. 

Figure 1. GenieTutor with chatbot 
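The routing described above can be sketched as follows. This is only an illustration under stated assumptions: `respond`, `classify_dialogue_act`, `chatbot_reply`, and `scenario_reply` are hypothetical names standing in for the semantic correctness evaluation module, the chatbot, and the scenario manager, and the toy classifier and the waiter's closing line are placeholders.

```python
from typing import Callable, Optional


def respond(
    user_utterance: str,
    scenario_prompt: str,
    classify_dialogue_act: Callable[[str], Optional[str]],
    chatbot_reply: Callable[[str], str],
    scenario_reply: Callable[[str], str],
) -> list[str]:
    """Return the system utterances for one user turn.

    If the utterance maps to a predefined dialogue act, the scenario answers.
    Otherwise the chatbot answers first, and the pending scenario prompt is
    repeated to bring the learner back to the topic conversation.
    """
    act = classify_dialogue_act(user_utterance)
    if act is not None:
        return [scenario_reply(act)]
    return [chatbot_reply(user_utterance), scenario_prompt]


def toy_classifier(utterance: str) -> Optional[str]:
    """Placeholder classifier: recognises a single in-scenario utterance."""
    return "pay_bill" if "here you go" in utterance.lower() else None


prompt = "Your total comes to 160 dollars."
print(respond("I have no money", prompt, toy_classifier,
              lambda u: "What a pity!", lambda act: "Thank you."))
# ['What a pity!', 'Your total comes to 160 dollars.']
print(respond("Here you go.", prompt, toy_classifier,
              lambda u: "What a pity!", lambda act: "Thank you."))
# ['Thank you.']
```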



The search engine Indri (Strohman, Metzler, Turtle, & Croft, 2005) was adopted to 
retrieve the most similar dialogue examples from a dialogue corpus. Each dialogue 
example contains two utterances, called a turn in the dialogue system: a query 
utterance paired with a reply. As most of the dialogues consist of short sentences 
(unlike the documents in document retrieval), a rescoring function was adopted to 
re-rank similar examples. When no similar example was found, a randomly chosen 
utterance was output to the user, which was intended to resemble a topic change in 
human conversation. 
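The retrieve, re-rank, and fall-back flow can be illustrated with a small self-contained sketch. It does not use Indri and does not reproduce the authors' rescoring function, which the paper does not specify: a shared-token filter stands in for the search engine, Jaccard token overlap stands in for rescoring, and the threshold of 0.5, the function names, and the toy corpus are all assumptions.

```python
import random
import re


def tokens(text):
    """Lower-cased word set of an utterance."""
    return set(re.findall(r"[a-z']+", text.lower()))


def rescore(query, example_query):
    """Jaccard overlap (assumed stand-in for the paper's rescoring function).

    Normalising by the union penalises length mismatch, which matters when
    both utterances are short.
    """
    q, e = tokens(query), tokens(example_query)
    return len(q & e) / len(q | e) if q | e else 0.0


def reply(query, corpus, threshold=0.5, rng=random):
    """Pick a reply from `corpus`, a list of (query, reply) turns."""
    # Stage 1 (stand-in for the search engine): keep turns sharing a token.
    candidates = [turn for turn in corpus if tokens(query) & tokens(turn[0])]
    # Stage 2: re-rank the candidates with the rescoring function.
    best = max(candidates, key=lambda turn: rescore(query, turn[0]), default=None)
    if best is not None and rescore(query, best[0]) >= threshold:
        return best[1]
    # No sufficiently similar example: output a random utterance,
    # mimicking a topic change.
    return rng.choice(corpus)[1]


toy_corpus = [
    ("I have no money", "What a pity!"),
    ("How are you today", "I am fine, thank you."),
    ("What is your name", "My name is Genie."),
]

print(reply("I don't have money", toy_corpus))  # What a pity!
```

A real deployment would replace the stage 1 filter with a query to the search engine index; the two-stage shape stays the same.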

The corpus contains about 410,000 turns, of which 18,000 were written 
by human developers, 237,000 were extracted from the MovieDic corpus (Banchs, 
2012), and another 155,000 were collected from various educational or travel 
materials, which have been developed over recent decades for dialogue machine 
translation purposes. 


4. User evaluation 

First, the chatbot was evaluated as an assistant module of GenieTutor Plus. The 
learners were required to have conversations with the system. They were 
allowed to enjoy free talking with the system (non-topic related conversation), 
but all subtasks had to be finished to achieve the task. The turn success ratio 
was computed as the proportion of user utterances for which the system response 
was judged ‘proper’. Twenty English learners participated in the evaluation on topics 
including ‘buying city-tour tickets’ and ‘ordering food’. An evaluation of the 
independent chatbot was also performed. The users were required to chat with the 
chatbot freely, with at least twenty turns being uttered. 
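The turn success ratio defined above, and its restriction to non-topic utterances, can be computed as in the sketch below. The record layout and the six example turns are invented for illustration; they are not data from the evaluation.

```python
def turn_success_ratio(judgements):
    """Share of user utterances whose system response was judged 'proper'.

    `judgements` holds one bool per user utterance.
    """
    if not judgements:
        raise ValueError("at least one judged turn is required")
    return sum(judgements) / len(judgements)


# Each record: (is_non_topic_utterance, response_judged_proper). Invented data.
turns = [
    (False, True), (False, True), (True, False),
    (False, True), (True, True), (True, False),
]

overall = turn_success_ratio([proper for _, proper in turns])
non_topic = turn_success_ratio([proper for is_free, proper in turns if is_free])
non_topic_share = sum(is_free for is_free, _ in turns) / len(turns)

print(f"turn success ratio (topic and non-topic): {overall:.2%}")    # 66.67%
print(f"non-topic user utterances: {non_topic_share:.2%}")           # 50.00%
print(f"turn success ratio (non-topic): {non_topic:.2%}")            # 33.33%
```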


Table 2. User evaluation on the chatbot as an auxiliary system of DB-CALL 
versus as an independent system 


System                       Turn success ratio (topic and non-topic)  Non-topic user utterances  Turn success ratio (non-topic)
Chatbot for GenieTutor Plus  71.30%                                    8.38%                      33.33%
Independent chatbot          -                                         100%                       52.78%


From Table 2, we can see that the chatbot has a lower success ratio as an 
assistant bot for a DB-CALL system than as an independent bot. The reason is that 
users tended to judge a non-topic response more stringently when it occurred in the 
context of the topic conversation. For example, the following system utterance would 
be acceptable if it were uttered by an independent chatbot, but would be considered 
improper during a food ordering task, in which the DB-CALL 
system played the role of a waiter: 

System: Would you like something to drink? 

User: Nothing. 

System: It is a damned ugly nothing. 


5. Conclusion 

In this paper, we introduced a chatbot into a DB-CALL system to deal with out 
of topic user utterances, so that the conversation could be more natural, like a 
conversation between people. However, the turn success ratio of such free talking 
in a DB-CALL system was lower than with an independent chatbot. We would like 
to continue our research by extracting a smaller but more suitable dialogue corpus 
for each topic in the DB-CALL system. 